Skip to content

Conversation

joaosaffran
Copy link
Contributor

@joaosaffran joaosaffran commented Aug 12, 2025

This patch adds 2 small validation to DirectX backend. First, it checks if registers in descriptor tables are not overflowing, meaning they don't try to bind registers over the maximum allowed value, this is checked both on the offset and on the number of descriptors inside the range; second, it checks if samplers are being mixed with other resource types.
Closes: #153057, #153058

@joaosaffran joaosaffran marked this pull request as ready for review September 11, 2025 00:20
Copy link
Contributor

@inbelic inbelic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM once we handle the update function

return make_error<InvalidRSMetadataValue>("ShaderVisibility");
}

static uint64_t updateOngoingOffset(uint64_t CurOfset, uint64_t NumDescriptors,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
static uint64_t updateOngoingOffset(uint64_t CurOfset, uint64_t NumDescriptors,
static uint64_t updateOngoingOffset(uint64_t CurOffset, uint64_t NumDescriptors,

Is this function required? I think we could just use the OffsetBound below in the same way as the front-end? Or we can make the function shared?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved this function to a common place and updated the frontend to use it. I wasn't able to get the correct behaviour by copying frontend's update function. The test rootsignature-validation-fail-appending-overflow.ll, was failing with it.
If you have something you would like me to try, please let me know.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit concerned that the change to the frontend to use this function doesn't also include a test change - the frontend's original logic was simple saturation, and this behaviour is not the same as that. Are we missing test coverage in the frontend that would show this fixing a bug?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the frontend has all tests covered. I spent sometime checking for equivalent errors for the tests I've added, here are the mappings I've found:

  • llvm/test/CodeGen/DirectX/rootsignature-validation-fail-appending-overflow.ll = RootSignature-resource-ranges-err.hlsl#append_to_unbound_signature
  • llvm/test/CodeGen/DirectX/rootsignature-validation-fail-offset-overflow.ll = RootSignature-resource-ranges-err.hlsl#direct_offset_overflow_signature
  • llvm/test/CodeGen/DirectX/rootsignature-validation-fail-register-overflow.ll = RootSignature-err.hlsl#basic_validation_0
  • llvm/test/CodeGen/DirectX/rootsignature-validation-fail-sampler-mix.ll =RootSignature-resource-ranges-err.hlsl#mixed_resource_table

@inbelic, let me know if you think otherwise.

Copy link
Contributor

@inbelic inbelic Sep 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should do it more like this: #153276 (comment).

To Justin's point though, I think we are missing a test-case for something like:

DescriptorTable(
  CBV(u0, offset = 2^32 - 3), numDescriptors = 4), // Overflows
  CBV(u1) // Appending to overflow
)

which would only report the first overflow and not the second because the offset will be truncated to 1 when converted to a uint32_t, when using updateOngoingOffset. It should report both when using the original frontend method.

Comment on lines 553 to 554
dxil::ResourceClass RangeType =
static_cast<dxil::ResourceClass>(Range.RangeType);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This cast does nothing - Range.RangeType is a dxil::ResourceClass.

Comment on lines 565 to 566
if (HasSampler && HasOtherRangeType)
return make_error<TableSamplerMixinError>(OtherRangeType, Location);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it deliberate that we report the last non-sampler range type rather than the first? I think the logic would be simpler here if we just tracked the current type and errored if we ever see a sampler when we have a current type other than a sampler. Then we could just do an early return and we need fewer bookkeeping variables to keep track of. Something like (untested):

  dxil::ResourceClass CurrRC = dxil::ResourceClass::Sampler;
  for (const mcdxbc::DescriptorRange &Range : Table.Ranges) {
    if (Range.RangeType == dxil::ResourceClass::Sampler &&
        CurrRC != dxil::ResourceClass::Sampler)
      return make_error<TableSamplerMixinError>(CurrRC, Location);
    CurrRC = Range.RangeType;
  }

  return Error::success();

Comment on lines 561 to 563
// Validation of NumDescriptors should have happened by this point.
if (Range.NumDescriptors <= 0)
continue;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The continue is probably fine here for the reasons you describe, but why are we checking <= 0? Since NumDescriptors is unsigned, wouldn't checking == 0 make more sense?

Comment on lines 578 to 579
const dxil::ResourceClass &RangeType =
static_cast<dxil::ResourceClass>(Range.RangeType);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another no-op cast

return make_error<InvalidRSMetadataValue>("ShaderVisibility");
}

static uint64_t updateOngoingOffset(uint64_t CurOfset, uint64_t NumDescriptors,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit concerned that the change to the frontend to use this function doesn't also include a test change - the frontend's original logic was simple saturation, and this behaviour is not the same as that. Are we missing test coverage in the frontend that would show this fixing a bug?

!5 = !{ !"DescriptorTable", i32 0, !6, !8, !9, !10, !11, !12, !13, !14, !15, !16, !17, !18, !19, !20 }
!3 = !{ !5, !21 } ; list of root signature elements
!5 = !{ !"DescriptorTable", i32 0, !10, !11, !12, !13, !14, !15, !16, !17, !18, !19, !20 }
!21 = !{ !"DescriptorTable", i32 0,!6, !8, !9 }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
!21 = !{ !"DescriptorTable", i32 0,!6, !8, !9 }
!21 = !{ !"DescriptorTable", i32 0, !6, !8, !9 }

}
};

class OffsetOverflowError : public ErrorInfo<OffsetOverflowError> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing needed for this PR as you're just following the established pattern, but we should probably come back and take a look at this error heirarchy at some point - I'm not really convinced we need this many distinct errors given all we ever do with them is print the error message (as opposed to handling specific errors in some way). We might be able to simplify these to something only a bit more structured than a StringError without any real loss to functionality or readability.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've created an issue to check this in the future: #159429

Copy link
Contributor

@bogner bogner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks generally good. One comment about the scope of a helper function.

Comment on lines 44 to 45
LLVM_ABI uint64_t updateOngoingOffset(uint64_t CurOffset,
uint64_t NumDescriptors, uint64_t Offset);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only used in one place now and it's pretty specific to that place. Does it really need to be a public function? Should we just make it static in the .cpp file or inline the logic where it's used since it's only in one place?

If we are putting it here, we probably need a doc comment to explain it.

Copy link
Contributor Author

@joaosaffran joaosaffran Sep 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method was originally intended to be used both in frontend and backend, but this is not needed anymore, so I've removed it in the latest commit

Copy link
Contributor

@inbelic inbelic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we are accidently introducing part of the same bug that was discovered in the front end

@@ -0,0 +1,16 @@
; RUN: not opt -S -passes='dxil-post-optimization-validation' -mtriple=dxil-pc-shadermodel6.6-compute %s 2>&1 | FileCheck %s
; CHECK: error: Cannot append range with implicit lower bound after an unbounded range CBV(register=2, space=0).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is technically the wrong error message and the same bug that was in the front end. Which is why we added the separate variable to track unbounded to make it easier to reason about

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From offline discussion, I've updated the code to look similar to the frontend. I've also updated the error messages to make sure they are reflecting the proper error.

@@ -0,0 +1,16 @@
; RUN: opt -S -passes='dxil-post-optimization-validation' -mtriple=dxil-pc-shadermodel6.6-compute %s 2>&1 | FileCheck %s
; This is correct according to DXC: https://hlsl.godbolt.org/z/KPG5o74KE
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
; This is correct according to DXC: https://hlsl.godbolt.org/z/KPG5o74KE
; A descriptor range can be placed at UINT_MAX, matching DXC's behaviour

@@ -0,0 +1,17 @@
; RUN: not opt -S -passes='dxil-post-optimization-validation' -mtriple=dxil-pc-shadermodel6.6-compute %s 2>&1 | FileCheck %s
; This test check if a resource is implicitly overflowing. That means, it is appending a resource after an unbounded range.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
; This test check if a resource is implicitly overflowing. That means, it is appending a resource after an unbounded range.
; This test checks if a resource is implicitly overflowing. That means, it is appending a resource after an unbounded range.

@joaosaffran joaosaffran merged commit c06f354 into llvm:main Sep 26, 2025
10 checks passed
mahesh-attarde pushed a commit to mahesh-attarde/llvm-project that referenced this pull request Oct 3, 2025
This patch adds 2 small validation to DirectX backend. First, it checks
if registers in descriptor tables are not overflowing, meaning they
don't try to bind registers over the maximum allowed value, this is
checked both on the offset and on the number of descriptors inside the
range; second, it checks if samplers are being mixed with other resource
types.
Closes: llvm#153057, llvm#153058

---------

Co-authored-by: joaosaffran <[email protected]>
Co-authored-by: Joao Saffran <{ID}+{username}@users.noreply.github.com>
Co-authored-by: Joao Saffran <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:DirectX HLSL HLSL Language Support
Projects
None yet
5 participants